Identifying synonymy between relational phrases using word embeddings

Authors

  • Nhung T. H. Nguyen
  • Makoto Miwa
  • Yoshimasa Tsuruoka
  • Satoshi Tojo
Abstract

Many text mining applications in the biomedical domain benefit from automatic clustering of relational phrases into synonymous groups, since it alleviates the problem of spurious mismatches caused by the diversity of natural language expressions. Most of the previous work that has addressed this task of synonymy resolution uses similarity metrics between relational phrases based on textual strings or dependency paths, which, for the most part, ignore the context around the relations. To overcome this shortcoming, we employ a word embedding technique to encode relational phrases. We then apply the k-means algorithm on top of the distributional representations to cluster the phrases. Our experimental results show that this approach outperforms state-of-the-art statistical models including latent Dirichlet allocation and Markov logic networks.
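The pipeline described in the abstract (encode each relational phrase as a vector, then cluster the vectors with k-means) can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the 2-dimensional word vectors in `TOY_EMBEDDINGS` are invented for illustration, averaging word vectors in `encode_phrase` is an assumed composition method that the abstract does not specify, and the function names and the example phrases are hypothetical.

```python
import math
import random

# Toy 2-d word vectors, invented for illustration only; real embeddings
# would be learned from a large corpus.
TOY_EMBEDDINGS = {
    "inhibits": [1.0, 0.1],
    "suppresses": [0.9, 0.2],
    "blocks": [0.95, 0.0],
    "activates": [0.1, 1.0],
    "induces": [0.2, 0.9],
    "stimulates": [0.0, 0.95],
    "expression": [0.5, 0.5],
    "of": [0.5, 0.5],
}


def encode_phrase(phrase, embeddings):
    """Encode a phrase as the average of its known word vectors (assumed method)."""
    vectors = [embeddings[w] for w in phrase.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError(f"no known words in phrase: {phrase!r}")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means with Euclidean distance; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centroids.append(
                    [sum(m[i] for m in members) / len(members) for i in range(dim)]
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids


# Hypothetical relational phrases: three "inhibit"-like, three "activate"-like.
PHRASES = [
    "inhibits expression of",
    "suppresses expression of",
    "blocks expression of",
    "activates expression of",
    "induces expression of",
    "stimulates expression of",
]

vectors = [encode_phrase(p, TOY_EMBEDDINGS) for p in PHRASES]
labels, centroids = kmeans(vectors, k=2)
for phrase, label in zip(PHRASES, labels):
    print(label, phrase)
```

On this toy data the two groups of phrases fall into separate clusters. In practice the number of clusters and the quality of the embeddings would both need tuning against annotated synonym groups.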


Similar resources

Measuring the Degree of Synonymy between Words Using Relational Similarity between Word Pairs as a Proxy

Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exists between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically,...


Learning Compositionality Functions on Word Embeddings for Modelling Attribute Meaning in Adjective-Noun Phrases

Word embeddings have been shown to be highly effective in a variety of lexical semantic tasks. They tend to capture meaningful relational similarities between individual words, at the expense of lacking the capability of making the underlying semantic relation explicit. In this paper, we investigate the attribute relation that often holds between the constituents of adjective-noun phrases. We us...


Enhanced Word Representations for Bridging Anaphora Resolution

Most current models of word representations (e.g., GloVe) have successfully captured fine-grained semantics. However, semantic similarity exhibited in these word embeddings is not suitable for resolving bridging anaphora, which requires the knowledge of associative similarity (i.e., relatedness) instead of semantic similarity information between synonyms or hypernyms. We create word embeddings (...


An Exploration of Embeddings for Generalized Phrases

Deep learning embeddings have been successfully used for many natural language processing problems. Embeddings are mostly computed for word forms although lots of recent papers have extended this to other linguistic units like morphemes and word sequences. In this paper, we define the concept of generalized phrase that includes conventional linguistic phrases as well as skip-bigrams. We compute...


Identifying Features from Opinion Mining Using Fine-Grained Relational Topic Weighted Approach

Opinion feature extraction is a subproblem of opinion mining analyzed at document, sentence, or even phrase (word) levels. Document-level (sentence-level) opinion mining is classified as overall subjectivity or sentiment, expressed in an individual review document. The existing approaches to opinion feature extraction depended on mining patterns from a particular evaluate corpus disregard non...



Journal:
  • Journal of biomedical informatics

Volume 56  Issue 

Pages  -

Publication date 2015